Abstract. Intrusion Detection Systems (IDSs) are becoming essential
to protecting modern information infrastructures. The effectiveness of
an IDS is directly related to the computational resources at its disposal.
However, sufficient resources are difficult to guarantee, especially with
the increasing demand for network capacity and the rapid proliferation of
attacks. On the other hand, modern intrusions often come as sequences of
attacks to reach some predefined goals. It is therefore critical to identify
the best default IDS configuration to attain the highest possible overall
protection within a given resource budget. This paper proposes a
game-theory-based solution to the problem of optimal signature-based IDS
configuration under resource constraints. We apply the concepts of indices
of power, namely, the Shapley value and the Banzhaf-Coleman index, from
cooperative game theory to quantify the influence or contribution of
libraries in an IDS with respect to given attack graphs. Such valuations
take into consideration the knowledge of common attack graphs and
experienced system attacks and are used to configure an IDS optimally at
its default state by solving a knapsack optimization problem.

Keywords: Intrusion Detection Systems, IDS Configuration, Cooperative
Games, Shapley Value, Banzhaf-Coleman Index.



1 Introduction 

The issue of optimal IDS configuration and provisioning has always been difficult 
to deal with, mainly due to the overwhelming number of parameters to tune. 
IDSs are generally shipped with a number of attack detection libraries (also 
known as categories [T3] or analyzers [T^]) with a considerable set of configu- 
ration parameters. The current version of the Snort IDS [13], for example, has 
approximately 10,000 signature rules located in fifty categories. Each IDS also 
comes with a default configuration to use when no additional information or
expertise is available. It is not trivial to determine the optimal configuration of an
IDS because it is essential to understand the quantitative relationship between 
the wide range of analyzers and tuning parameters. This explains the reason 
why current IDSs are configured and tuned simply based on a trial-and-error 
approach. Although there have been recent approaches, such as in P^[T51[^ . 
to optimize IDS resource consumption, we still need to deal with resource
constraints and make the best use of an IDS within the available resource
budget. On the other hand, most current computer attacks do not come in one
shot but in several steps, through which attackers acquire an increasing
amount of knowledge and privileges to attack the target system. To describe
such multi-stage behaviors, attack graphs or trees are commonly used as tools
to model the security vulnerabilities of a system and all possible sequences
of exploits used by intruders.

In this paper, we develop a novel game-theory-based solution to the problem
of optimal default signature-based IDS configuration under resource limitations.
The solution considers the costs and functionalities of libraries and the
defender's knowledge of common attack graphs to configure an IDS optimally at
its default state.

The contributions of this paper can be summarized as follows. We introduce
the concept of detectability of an attack sequence with respect to a given set of
IDS libraries and devise metrics to measure the detectability and the efficacy
of detection. From a game-theoretical perspective, we view a configuration as a
coalition among libraries and apply the indices of power, namely, the Shapley
value and the Banzhaf-Coleman index, to rank the overall importance of a library
for the purpose of intrusion detection. This ranking can be used in a knapsack
problem for finding the optimal default configuration. In addition, we extend
our results to general attack graphs based on the multilinear extension and
propose a scheme to approximate the indices of power when the number of
libraries is large.

The rest of the paper is organized as follows. In the next section, we
summarize some recent related work on IDS configuration and cooperative games.
In Section 3, we define the important notion of detectability and establish a
mathematical model for attackers and detectors. In Section 4, we formulate a
cooperative game framework to evaluate the indices of power for a given attack
sequence. In Section 5, we introduce the multilinear extension as a general
framework and an approximation technique to evaluate the indices of power.
Finally, in Section 6, we conclude the paper.



2 Related Work 

There is a growing literature on the performance characterization of IDSs in
the computer science community. Some of the related work is summarized as
follows.

2.1 IDS Performance Evaluation

Gaffney et al. in [4] use a decision analysis that integrates and extends Receiver
Operating Characteristics (ROCs) to provide an expected cost metric. They 
demonstrate that the optimal operation point of an IDS depends not only on 
the system's own ROC curve and quantities such as the expected rate of false 
positives, false negatives, and the cost of operation, but also on the degree of 
hostility of an environment in which the IDS is situated, such as the probability 
and the type of an intrusion. Hence, the performance evaluation of an IDS has 
to take into account both the defender's side and attacker's side. 

In [23], a network security configuration problem is studied. A nonzero-sum
stochastic game is formulated to capture the interactions among distributed
intrusion detection systems in the network as well as their interactions against
exogenous intruders. The authors propose the notion of security capacity as
the largest achievable payoff to an agent at an equilibrium to yield performance
limits on the network security, and a mathematical programming approach is
used to characterize the equilibrium as well as the feasibility of a given security
target.

Zhu and Başar in [22] use a zero-sum stochastic game to capture the dynamic
behavior of the defender and the attacker. The transition between different
system states depends on the actions taken by the attacker and the defender. The
action of the defender at a given time instant is to choose a set of libraries as
its configuration, whereas the action of the attacker is to choose an attack from
a set of possible ones. A change of configuration from one instant to the next
means that the defender either loads new libraries or features into the
configuration or unloads part of the current ones. The actions taken by the
attacker at different times constitute a sequence of attacks. An online
Q-learning algorithm is used to learn the optimal defense response strategies for
the defender based on samples of outcomes from the game.

In this paper, we address the issue of optimal default configuration, which
is complementary to the one addressed in [22]. We find an optimal configuration
which can serve as an initial or starting profile for dynamic IDS configuration.

To identify important factors for the performance of an IDS is another crucial 
investigation. In pijj, Schaelicke et al. observe several architectural and system 
parameters that contribute to the effectiveness of an IDS, such as operating 
system structure, main memory bandwidth and latency as well as the processor 
micro-architecture. Memory bandwidth and latency are identified as the most 
significant contributors to sustainable throughput. CPU power is important as 
well; however, it has been overlooked in the experiments due to the existence of 
other closely related architectural parameters, such as deep pipelining, level of 
parallelism, and caching. 

In [2], the authors investigate the prediction of resource consumption based
on the traffic profile. An interesting result, which we assume to be available in
this paper, is that both CPU and memory usage can be predicted with a model
linear in the number of connections. Equally important is the confirmation that
factoring IDS resource usage into per-analyzer and per-connection scaling is
a reasonable assumption. The authors use this finding to build an analyzer
selection and configuration tool that estimates resource consumption per analyzer
to determine whether a given configuration is feasible. The constraint
used is a target CPU load below which the load should remain for a predefined
percentage of time. The actual selection of a feasible analyzer set is, however,
left as a manual task for the IDS operator. In our work, we propose a more
informed and automated way of making IDS configuration decisions that takes into
account the resource utilization per IDS library as well as the expected intrusion
context based on experienced attack sequences or graphs.

2.2 Attack Graphs 

The generation of attack graphs has received considerable attention in the litera- 
ture [51l71l5 1fTBlll7lll9ll21j . Sheyner et al. present in [TB1IT7] a tool for automatically 
generating attack graphs and performing different kinds of formal vulnerability 
analysis on them. Attack graphs have also been used in intrusion containment. 
Foo et al. develop in [5] the ADEPTS intrusion containment system in the con- 
text of E-commerce environments. The system builds a graph of intrusion goals, 
localizes intrusions, and deploys responses at the appropriate services to allow 
the system to work with minimum overall performance degradation. The system 
takes into consideration the financial impact of an attack and derives response 
actions that go beyond the simple deactivation or isolation of the infected ser- 
vice/host by considering interaction effects among multiple components of the 
protected environment. Finally, in [10], attack graphs are used to derive optimal
IDS placement in a network so as to minimize intrusion risk. The authors
developed the TVA tool (Topological Vulnerability Analysis), which can be used
to model a network and populate it with information regarding vulnerabilities.
The tool is claimed to have the ability to avoid state-space explosion through
attack graph reduction. In this paper, we assume that such knowledge of attack
graphs is given or has been acquired previously through experience.

2.3 Game-theoretical Methods 

Game-theoretical methods appear to be an appropriate framework that connects 
the performance evaluation of an IDS with the attack sequences or graphs on 
the intruder's side. The concepts in cooperative game theory become natural to 
study the contribution of each IDS component to the attack sequence, especially 
when we view a configuration as a coalition among IDS libraries. 

Cooperative game theory studies the outcome of a game when coalitions
are allowed among multiple players. The concepts of the core and stable sets
are regarded as solutions to N-person cooperative games. However, the lack
of a general existence theorem has led game theorists to look for other solution
concepts. Currently, indices of power such as the Shapley value, the
Banzhaf-Coleman index of power and their multilinear extensions have been widely
used in a variety of literature involving resource allocation and estimation of
power in a group of decision-making agents. Examples have been given of the
application of indices of power in the analysis of presidential election games,
with the quantitative conclusion that voters in some states hold more power than
voters in other states in the election. In [1], the Shapley value is used to allocate
profit in a multi-retailer and a single supplier cooperative game where players
can form inventory-pooling coalition. In [S] , an efficient measurement allocation 
in unattended ground sensor networks is suggested based on Shapley values. It 
is shown that by allocating measurements proportional to the Shapley value, the 
observability of localizing a target increases. A similar approach was also found 
to allocate unit start-up costs among electricity consumers, load and retailers. 

It appears that there has been very little work on using game theoretical 
methods to study IDS configurations. Similar to the problems involving resource 
allocations and presidential elections, cooperative game theory lends itself natu- 
rally also to studying the relations among libraries in an adversarial environment. 

3 Attacker and Detector Model 

We let $\mathcal{L} = \{l_1, l_2, \cdots, l_N\}$ denote the set of a finite number
of libraries and $\mathcal{L}^*$ denote the set of all the possible subsets of
$\mathcal{L}$, with cardinality $|\mathcal{L}^*| = 2^N$. We let
$F_i \in \mathcal{L}^*$, $i \in \{1, 2, \cdots, 2^N\}$, be a configuration set of
libraries, which is a subset of $\mathcal{L}$. Each library has a cost associated
with it, i.e., there is a mapping function $C : \mathcal{L} \to \mathbb{R}_+$ that
determines the cost of each library, $c_i = C(l_i)$. Assuming the cost of each
library is independent of the others [2], we define the cost of a configuration
$F_i$ by $C_{F_i} = C^*(F_i) = \sum_{x \in F_i} C(x)$, where
$C^* : \mathcal{L}^* \to \mathbb{R}_+$ is a mapping function of configuration cost.

The attacker, on the other hand, has different types of attacks $a_i$. Let
$a_i \in \mathcal{A}$ be a specific action of attack and
$\mathcal{A} = \{a_1, a_2, \cdots, a_M\}$ be the set of possible attacks. We define
a sequence of attacks $S_i$ to be a tuple of elements of $\mathcal{A}$, and
$\mathcal{A}^*$ to be the set of all possible sequences of attacks. The order of
the elements in $S_i$ indicates a sequential strategy of an intrusion. Every attack
$a_i \in \mathcal{A}$ incurs a damage $d_i$, given by the mapping function
$\mathcal{D} : \mathcal{A} \to \mathbb{R}_+$, i.e., $d_i = \mathcal{D}(a_i)$,
$\forall a_i \in \mathcal{A}$. Assuming that the damage caused by a sequence of
attacks does not depend on the order of the sequence and that the damage by one
attack is independent of other attacks, we define the damage caused by an attack
sequence $S_i \in \mathcal{A}^*$ by
$D_{S_i} = \mathcal{D}^*(S_i) = \sum_{x \in S_i} \mathcal{D}(x)$.

Each library $l_i$ can only effectively detect certain attacks. We define the set
$P_{l_i} \subseteq \mathcal{A}$ as its scope of detection. A library $l_i$ is
capable of detecting an attack $a_j$ if and only if $a_j \in P_{l_i}$; otherwise
the library $l_i$ is sure to fail to detect it. The definition of detectability of
a library configuration follows from the scope of detection of each library.

Without loss of generality, we can further assume that
$\bigcap_{l_i \in \mathcal{L}} P_{l_i} = \emptyset$ because we can always define
libraries to have functions that do not overlap with each other. This is
particularly true in practice with signature-based libraries.

Definition 1 An attack sequence $S_i$ is detectable by a library configuration
$F_i$ if $S_i \subseteq T_i$, where $T_i := \bigcup_{l_k \in F_i} P_{l_k}$. An
attack sequence $S_i$ is undetectable by $F_i$ if $S_i \subseteq \bar{T}_i$, where
$\bar{T}_i := \mathcal{A} \setminus T_i$.

Based on Definition 1, we can separate an attack sequence $S_i$ into two
subsequences $S_i^o$ and $S_i^*$, where $S_i^o$ is undetectable and $S_i^*$ is
detectable. These two subsequences are mutually exclusive and together make up
the whole sequence, i.e., $S_i^o \cup S_i^* = S_i$ and
$S_i^o \cap S_i^* = \emptyset$.

An example is given in Fig. 1, where we have a sequence of five attacks
$a_1, a_2, a_3, a_4$ and $a_5$. Detection library $l_1$ can be used to monitor
$a_1$ and $a_2$ effectively, whereas libraries $l_2$ and $l_3$ can only detect
$a_3$ and $a_4$, respectively. However, $a_5$ is alien to the detection system and
no library can be used to detect $a_5$. An IDS will rely on the successful
detection of earlier known attack stages to prevent the last unknown one.




Fig. 1. A library configuration $F_i$ that consists of libraries $l_1, l_2, l_3$ is
used to detect an attack sequence $a_1 \to \cdots \to a_5$.



The definition of detectability assumes that each library can detect a certain
signature or anomaly-based attack with success once it is loaded. However, for
many practical reasons, such as delay and mutations of attacks, we can only
detect successfully with some true positive (TP) rate. We use $\alpha^p_{ij}$ to
denote the probability of successful detection of an attack $a_i \in \mathcal{A}$
using library $l_j$, and by definition $\alpha^p_{ij} = 0$ for $a_i$ not in
$P_{l_j}$. The probability of undetected attacks when attacks occur, or the false
negative (FN) rate, is thus given by $\alpha^n_{ij} = 1 - \alpha^p_{ij}$.

We also provide a metric that measures the detectability of an attack se- 
quence Si with respect to a configuration Fj , and the efficiency of detection for 
Si. 

Definition 2 Let the function $v : \mathcal{A}^* \to \mathbb{R}$ be a value
function defined on the attacker's set of sequences, satisfying

$$v(A_1 \cup A_2) \le v(A_1) + v(A_2),$$ (1)

where $A_1, A_2 \in \mathcal{A}^*$. Given a library component $l_j$, its coverage
$P_{l_j}$, and an attack sequence $S_i$, we define the detection effectiveness,
$\eta_{ij} \in [0, 1]$, as follows:

$$\eta_{ij} = \frac{v(S_i \cap P_{l_j})}{v(S_i)}.$$ (2)

Remark 1 One simple choice of $v$ is the cardinality of a set, i.e.,
$v(S_i) = \mathrm{card}(S_i)$. We can also have more complicated value functions;
for example, we may place more weight on particularly important attacks or on
final attacks.

Given a configuration $F$ which consists of a finite number of libraries, we use
the definition in (2) to define the detectability of an IDS configuration with
respect to an attack sequence $S_i$ as follows:

$$\eta_i = \frac{v(S_i \cap T)}{v(S_i)} = \sum_{l_j \in F} \eta_{ij},$$ (3)

where $T$ is the coverage of configuration $F$.

Using the concepts of TP/FN, we can weight our definition of detectability in (3)
by the true positive rate $\alpha^p_{ij}$. Thus we have the definition of weighted
detectability as follows.

Definition 3 Given a configuration $F$ and attack sequence $S_i$, the weighted
detectability is defined as

$$\eta_i^{\alpha} = \sum_{l_j \in F} \alpha^p_{ij} \, \eta_{ij}.$$ (4)

The notion of detectability shows the effectiveness of detection configuration 
Fj with respect to attack sequence Si. We will later use detectability as a metric 
to optimize the performance of an IDS since higher detectability yields better 
detection results. 
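The two detectability metrics can be illustrated with a minimal Python sketch. It assumes the cardinality value function of Remark 1, pairwise disjoint scopes of detection, and the libraries of Fig. 1; the function names, the dictionary layout, and the uniform true positive rate of 9/10 are illustrative assumptions, not part of the model above.

```python
from fractions import Fraction

# Scope of detection of each library, following the Fig. 1 example.
SCOPE = {"l1": {"a1", "a2"}, "l2": {"a3"}, "l3": {"a4"}}
S1 = ["a1", "a2", "a3", "a4", "a5"]

def detectability(sequence, config, scope):
    """v(S ∩ T) / v(S) with v = cardinality and T the union of the scopes."""
    covered = set()
    for lib in config:
        covered |= scope[lib]
    detected = [a for a in sequence if a in covered]
    return Fraction(len(detected), len(sequence))

def weighted_detectability(sequence, config, scope, tp_rate):
    """Each detected attack counts with its true positive rate instead of 1.

    Assumes disjoint scopes, so an attack is matched by at most one library.
    tp_rate maps (attack, library) pairs to a probability (hypothetical input).
    """
    total = Fraction(0)
    for lib in config:
        for attack in sequence:
            if attack in scope[lib]:
                total += Fraction(tp_rate[(attack, lib)])
    return total / len(sequence)

# Hypothetical uniform true positive rate of 9/10 for every (attack, library) pair.
TP = {(a, lib): Fraction(9, 10) for lib, attacks in SCOPE.items() for a in attacks}

print(detectability(S1, {"l1", "l2"}, SCOPE))               # 3/5
print(weighted_detectability(S1, {"l1", "l2"}, SCOPE, TP))  # 27/50
```

With perfect true positive rates the two metrics coincide; lowering the rates scales down the contribution of each detected attack.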

On the other hand, we can also describe the efficiency of a detection using the
value function $v$.

Definition 4 Given an attack sequence $S_i$ and configuration $F$, we let
$\zeta_i$ describe the efficiency of detection

$$\zeta_i = \frac{v(S_i \cap T)}{v(T)} = \sum_{l_k \in F} \frac{v(S_i \cap P_{l_k})}{v(T)},$$ (5)
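and let $\zeta_i^{\alpha}$ denote the weighted detection efficiency given by

$$\zeta_i^{\alpha} = \sum_{l_k \in F} \alpha^p_{ik} \, \frac{v(S_i \cap P_{l_k})}{v(T)},$$ (6)

where $T$ is the coverage of configuration $F$.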

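Proposition 5 With (1) and the metrics $\eta_i$ and $\zeta_i$ defined in (3) and
(5), respectively, we have the following relation between these two metrics:

$$\frac{1}{\eta_i} + \frac{1}{\zeta_i} \ge 1, \quad \forall S_i \in \mathcal{A}^*.$$ (7)

Proof. Using the definitions in (3), (5) and (1), we arrive at

$$\frac{1}{\eta_i} + \frac{1}{\zeta_i} = \frac{v(S_i) + v(T)}{v(S_i \cap T)} \ge \frac{v(S_i \cup T)}{v(S_i \cap T)} \ge 1.$$ (8)

The first inequality follows from (1), and the second holds for any monotone
value function such as the cardinality. The inequality (7) provides a fundamental
tradeoff relationship between detectability and efficiency.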
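The tradeoff can be made concrete with a small Python sketch under the cardinality value function. The library `lx`, whose scope contains no attack of the sequence, is a hypothetical addition used only to show that loading it leaves the detectability unchanged while lowering the efficiency.

```python
from fractions import Fraction

# "lx" is a hypothetical extra library that matches none of the attacks in
# the sequence; the other scopes follow the Fig. 1 example.
SCOPE = {"l1": {"a1", "a2"}, "l2": {"a3"}, "lx": {"a6", "a7"}}
SEQUENCE = {"a1", "a2", "a3", "a4", "a5"}

def coverage(config):
    """Union T of the scopes of detection of the loaded libraries."""
    covered = set()
    for lib in config:
        covered |= SCOPE[lib]
    return covered

def detectability(config):
    """v(S ∩ T) / v(S) with v = cardinality."""
    return Fraction(len(SEQUENCE & coverage(config)), len(SEQUENCE))

def efficiency(config):
    """v(S ∩ T) / v(T) with v = cardinality (config must be non-empty)."""
    covered = coverage(config)
    return Fraction(len(SEQUENCE & covered), len(covered))

for config in (["l1", "l2"], ["l1", "l2", "lx"]):
    eta, zeta = detectability(config), efficiency(config)
    print(config, eta, zeta, 1 / eta + 1 / zeta)
```

For both configurations the detectability is 3/5, while the efficiency drops from 1 to 3/5 once `lx` is loaded, so the sum of reciprocals grows from 8/3 to 10/3.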
4 Cooperative Game Model

In this section, we review essential concepts of indices of power and use them 
in the context of optimal default IDS configuration. We can view each possible 
configuration as a coalition of different libraries and hence each library can be 
associated with an index of power, signifying its contribution to the detection of 
intrusions. 

We introduce the notion of $\omega$-effective detection to quantify the goal of
intrusion detection.

Definition 6 We call a configuration $F$ of an IDS $\omega$-effective for attack
sequence $S_i$ if the detectability does not fall below $\omega$, that is,
$\eta_i \ge \omega$, and $(\omega, \alpha)$-effective if the weighted detectability
does not fall below $\omega$, that is, $\eta_i^{\alpha} \ge \omega$.

The parameter $\omega$ is the level of intrusion detection performance an IDS
wants to achieve. We call a configuration goal achieving if it is
$(\omega, \alpha)$-effective, and unsatisfactory otherwise.

4.1 Shapley Values and B-C index 

To describe an N-person cooperative game using game-theoretical language, we
let $\mathcal{L}$ be the set of the players, and any subset of $\mathcal{L}$, or a
configuration $F \in \mathcal{L}^*$, be a coalition. We let
$f : \mathcal{L}^* \to \{0, 1\}$ be a characteristic function of the game
having the basic properties that

(P1) $f(\emptyset) = 0$;

(P2) $f(F_1 \cup F_2) \ge f(F_1) + f(F_2)$, for $F_1, F_2 \in \mathcal{L}^*$ with $F_1 \cap F_2 = \emptyset$;

(P3) $f(\{l_i\}) = 0$, for all $l_i \in \mathcal{L}$;

(P4) $f(\mathcal{L}) = 1$.

By having the characteristic function $f$ take the values 0 and 1 only, we have
defined a simple game. The value 1 from the mapping $f$ of a coalition or
configuration means a winning or goal-achieving library configuration, whereas the
value 0 yields a non-winning or unsatisfactory library configuration.

A carrier of the cooperative game is a coalition $F_c \in \mathcal{L}^*$ such
that $f(F) = f(F \cap F_c)$ for all $F$. Libraries outside a carrier are dummies:
they do not contribute to any configuration in attaining the goal of detection.
We can always remove such dummy libraries from our library list or disregard
them in our cooperative game once they are found to lie outside a carrier.

The Shapley value of the $i$-th library $l_i$ is given by

$$\phi_i = \sum_{F \subseteq \mathcal{L},\, l_i \in F} \frac{(|F| - 1)! \, (N - |F|)!}{N!} \left[ f(F) - f(F \setminus \{l_i\}) \right],$$ (9)

for all $l_i \in \mathcal{L}$.

The Shapley value $\phi_i$ given in (9) evaluates the contribution of each library
toward achieving $\omega$-effective detection. Since the characteristic mapping $f$
only takes values in 0 and 1, the Shapley value can be further simplified into

$$\phi_i = \sum_{R' \subseteq \mathcal{L}} \frac{(|R'| - 1)! \, (N - |R'|)!}{N!},$$ (10)

where, for a given $l_i$, the sum runs over the winning coalitions $R'$ such that
the configuration can achieve $\omega$-effectiveness with $l_i \in R'$, whereas
$R' - \{l_i\}$ fails to achieve the goal. For a small-scale problem, the Shapley
value is relatively easy to compute. However, in large problems, the evaluation of
the weights creates computational overhead, and the complexity increases
exponentially with the number of libraries. An easier index of power to compute is
the Banzhaf-Coleman index of power, or B-C index, which depends on counting the
number of swings, i.e., the number of coalitions or configurations that win when
$l_i$ is included but lose when $l_i$ is not.

Definition 7 (B-C Index, IJJJ) The normalized Banzhaf-Coleman index $\beta_i$,
$\forall l_i \in \mathcal{L}$, is given by

$$\beta_i = \frac{\theta_i}{\sum_{l_j \in \mathcal{L}} \theta_j},$$ (11)

where $\theta_i$ is the number of swings for $l_i$; a swing for
$l_i \in \mathcal{L}$ is a set $R \subseteq \mathcal{L}$ with $l_i \in R$ such
that $R$ is a goal-achieving configuration and $R - \{l_i\}$ is not.

The Shapley value and the B-C index are closely related. They can both be
evaluated by the multilinear extension (see Section 5). The difference lies in the
weighting coefficients used. In the Shapley value, the weights vary according to
the coalition size, whereas in the B-C index, the weights are all equal.

4.2 An Example 

Suppose we are given an attack sequence $S_i$ as depicted in Fig. 1, where we have
five attack actions ordered by $a_1 \to a_2 \to a_3 \to a_4 \to a_5$. There are 3
libraries and the sets $P_{l_i}$, $i = 1, 2, 3$, are given as follows:
$P_{l_1} = \{a_1, a_2\}$, $P_{l_2} = \{a_3\}$, $P_{l_3} = \{a_4\}$. It is obvious
that the sequence $S_i$ can only be partially detected, as $a_5$ is alien to the
existing libraries of the IDS. Suppose that each library has TP rates equal to 1
and $v$ is the cardinality of the set. We can use the Shapley value and the B-C
index to quantify the contribution of each library to the detection of the
sequence $S_i$. Let $\omega = 3/5$. The sets of swings for $l_1$, $l_2$ and $l_3$
are $\{(l_1, l_2), (l_1, l_3), (l_1, l_2, l_3)\}$, $\{(l_1, l_2)\}$, and
$\{(l_1, l_3)\}$, respectively. The Shapley values are thus given by
$\phi_1 = 2/3$, $\phi_2 = 1/6$, and $\phi_3 = 1/6$; and the B-C indices are
$\beta_1 = 3/5$, $\beta_2 = 1/5$, and $\beta_3 = 1/5$. To achieve the
$\omega = 3/5$ level of detection, $l_1$ is most important, and $l_2$ and $l_3$
are equally important. Therefore, in terms of the priority of loading libraries,
$l_1$ should be placed first, and then one should consider $l_2$ and $l_3$. Such
an evaluation via the Shapley value and the B-C index is useful for an IDS to
assess the influence of each library and to decide which libraries to load
initially when cost constraints are present.
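The figures of this example can be checked by brute-force enumeration of coalitions. The Python sketch below assumes the cardinality value function, TP rates equal to 1 and $\omega = 3/5$, as in the example; the function names are illustrative, and the enumeration is only practical for a small number of libraries.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

SCOPE = {"l1": {"a1", "a2"}, "l2": {"a3"}, "l3": {"a4"}}
SEQUENCE = ["a1", "a2", "a3", "a4", "a5"]
OMEGA = Fraction(3, 5)

def wins(config):
    """Characteristic function: True if the configuration is omega-effective."""
    covered = set().union(*(SCOPE[lib] for lib in config))
    detected = sum(1 for a in SEQUENCE if a in covered)
    return Fraction(detected, len(SEQUENCE)) >= OMEGA

def swings(lib):
    """Winning coalitions containing `lib` that lose once `lib` is removed."""
    others = [l for l in SCOPE if l != lib]
    found = []
    for size in range(len(others) + 1):
        for rest in combinations(others, size):
            if wins(set(rest) | {lib}) and not wins(set(rest)):
                found.append(frozenset(rest) | {lib})
    return found

def shapley(lib):
    """Sum of (|R|-1)! (N-|R|)! / N! over the swings R of `lib`."""
    n = len(SCOPE)
    return sum(
        Fraction(factorial(len(r) - 1) * factorial(n - len(r)), factorial(n))
        for r in swings(lib)
    )

def banzhaf_coleman(lib):
    """Number of swings of `lib`, normalized by the total number of swings."""
    total = sum(len(swings(l)) for l in SCOPE)
    return Fraction(len(swings(lib)), total)

for lib in SCOPE:
    print(lib, shapley(lib), banzhaf_coleman(lib))
```

The loop prints 2/3 and 3/5 for `l1`, and 1/6 and 1/5 for each of `l2` and `l3`.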

5 Multiple Attack Sequences and Multilinear Extension

In Section 4, we introduced a cooperative game and proposed the concept of
$\omega$-effectiveness to determine the winning or losing coalitions for a given
known sequence. In this section, we extend this framework to deal with multiple
cooperative games with respect to different sequences in an attack graph. We look
at multilinear extensions in this section for two reasons: first, they can be used
to approximate the Shapley value when the number of libraries grows; and second,
they form a general framework that also yields the B-C index.



5.1 Multilinear Extension (MLE) 

A multilinear extension is a continuous function that can be used to evaluate
the Shapley value and the B-C index as special cases. We let each library
$l_i \in \mathcal{L}$ be associated with a continuous variable $x_i \in [0, 1]$,
and $f_j$ be a characteristic function for detecting a particular attack sequence
$S_j \in \mathcal{A}^*$. We now introduce the multilinear extension for an attack
sequence $S_j$, denoted by $h_j$, as follows:

Definition 8 The multilinear extension of the cooperative game with
characteristic function $f_j$ is a function $h_j : [0, 1]^N \to \mathbb{R}_+$
given by

$$h_j(x_1, x_2, \cdots, x_N) = \sum_{R \subseteq \mathcal{L}} \left\{ \prod_{l_i \in R} x_i \prod_{l_i \in \mathcal{L} - R} (1 - x_i) \right\} f_j(R).$$ (12)



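The function $h_j$ can be used to evaluate the Shapley value by (13) below and
the B-C index by (14) below:

$$\phi_{ij} = \int_0^1 \left. \frac{\partial h_j(x_1, x_2, \cdots, x_N)}{\partial x_i} \right|_{x_1 = t, x_2 = t, \cdots, x_N = t} dt,$$ (13)

$$\beta_{ij} = \left. \frac{\partial h_j(x_1, x_2, \cdots, x_N)}{\partial x_i} \right|_{x_1 = 1/2, x_2 = 1/2, \cdots, x_N = 1/2},$$ (14)

where $\phi_{ij}$ and $\beta_{ij}$ are the Shapley value and B-C index of library
$l_i$ for detecting sequence $S_j$, respectively. The derivative in (14) equals
the number of swings divided by $2^{N-1}$, so it agrees with (11) up to
normalization. The set $M$ is a subset of $\mathcal{A}^*$ that models a set of
attacks known to detectors.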
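The use of the multilinear extension can be sketched numerically. The Python example below hard-codes the winning coalitions of the simple game of Section 4.2 and evaluates the partial derivative exactly as a difference, which is valid because the extension is linear in each variable. The function names and the midpoint rule used for the integral are illustrative choices, not part of the paper.

```python
from itertools import combinations

PLAYERS = ["l1", "l2", "l3"]
# Winning coalitions of the simple game in Section 4.2 (omega = 3/5).
WINNING = [{"l1", "l2"}, {"l1", "l3"}, {"l1", "l2", "l3"}]

def f(coalition):
    """Characteristic function of the simple game."""
    return 1 if coalition in WINNING else 0

def mle(x):
    """Sum over coalitions R of prod_{i in R} x_i * prod_{i not in R} (1 - x_i) * f(R)."""
    total = 0.0
    for size in range(len(PLAYERS) + 1):
        for coalition in combinations(PLAYERS, size):
            members = set(coalition)
            weight = 1.0
            for p in PLAYERS:
                weight *= x[p] if p in members else 1.0 - x[p]
            total += weight * f(members)
    return total

def partial(x, i):
    """Partial derivative with respect to x_i; exact since h is linear in x_i."""
    return mle({**x, i: 1.0}) - mle({**x, i: 0.0})

def shapley_from_mle(i, steps=1000):
    """Midpoint-rule approximation of the integral of dh/dx_i along x = (t, ..., t)."""
    acc = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        acc += partial({p: t for p in PLAYERS}, i)
    return acc / steps

def banzhaf_from_mle(i):
    """dh/dx_i at (1/2, ..., 1/2): number of swings divided by 2^(N-1)."""
    return partial({p: 0.5 for p in PLAYERS}, i)

print(shapley_from_mle("l1"), banzhaf_from_mle("l1"))
```

For `l1` the integral is approximately 2/3 and the derivative at 1/2 is 3/4, that is, three swings divided by $2^{N-1} = 4$; dividing by the sum over all libraries recovers the normalized index 3/5.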

To aggregate the effect of a library on detecting a set of sequences
$M \subseteq \mathcal{A}^*$, we define an aggregated MLE $h$ as a weighted sum of
the MLEs over all the sequences, as follows:

$$h(x_1, x_2, \cdots, x_N) = \sum_{S_j \in M} p_j \, h_j(x_1, x_2, \cdots, x_N),$$ (15)

where $p_j$ is a weight on $h_j$, indicating the frequency of occurrence of the
attack sequence $S_j$. It is a normalized parameter that satisfies $p_j \ge 0$ and
$\sum_{S_j \in M} p_j = 1$.

Proposition 9 The Shapley value $\phi_i$ for detecting multiple sequences is given
by

$$\phi_i = \sum_{S_j \in M} p_j \, \phi_{ij},$$ (16)

and the B-C index $\beta_i$ for detecting multiple sequences is given by

$$\beta_i = \frac{1}{2^{N-1}} \sum_{S_j \in M} p_j \, \theta_{ij},$$ (17)

where $\theta_{ij}$ is the number of swings of library $l_i$ for detecting
sequence $S_j$.

Proof. The proposition can be proved using the linearity of the MLE.
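By linearity, the aggregated Shapley value is the frequency-weighted sum of the per-sequence Shapley values. The following Python sketch illustrates this with the libraries of Fig. 1; the second sequence `S2` and the equal frequencies are hypothetical data chosen for illustration, and the threshold is again assumed to be $\omega = 3/5$.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

SCOPE = {"l1": {"a1", "a2"}, "l2": {"a3"}, "l3": {"a4"}}
OMEGA = Fraction(3, 5)
# Hypothetical known sequences with their frequencies of occurrence p_j.
SEQUENCES = {
    "S1": (["a1", "a2", "a3", "a4", "a5"], Fraction(1, 2)),
    "S2": (["a1", "a2", "a3"], Fraction(1, 2)),
}

def wins(config, sequence):
    """True if `config` is omega-effective for `sequence` (cardinality value)."""
    covered = set().union(*(SCOPE[lib] for lib in config))
    hits = sum(1 for a in sequence if a in covered)
    return Fraction(hits, len(sequence)) >= OMEGA

def shapley(lib, sequence):
    """Shapley value of `lib` in the simple game induced by one sequence."""
    n = len(SCOPE)
    others = [l for l in SCOPE if l != lib]
    value = Fraction(0)
    for size in range(n):
        for rest in combinations(others, size):
            if wins(set(rest) | {lib}, sequence) and not wins(set(rest), sequence):
                # The coalition has size + 1 members including `lib`.
                value += Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
    return value

def aggregated_shapley(lib):
    """Frequency-weighted sum of the per-sequence Shapley values."""
    return sum(p * shapley(lib, seq) for seq, p in SEQUENCES.values())

for lib in SCOPE:
    print(lib, aggregated_shapley(lib))
```

In this hypothetical setting `l1` alone already reaches the threshold for `S2`, so its aggregated value rises to 5/6, while `l2` and `l3` each drop to 1/12.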

5.2 Multilinear Approximation 

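As is well known, the multilinear extension in (12) has a probabilistic
interpretation, and it can be used to approximate the indices of power when the
number of libraries grows large. We can view the variable $x_i \in [0, 1]$ as the
probability of a library $l_i$ being in a random coalition
$\mathcal{S} \subseteq \mathcal{L}$ such that $f_j(\mathcal{S}) = 1$ when
$l_i \in \mathcal{S}$, and $f_j(\mathcal{S}) = 0$ otherwise. Since the event that
a library is in a random coalition is independent of the event that other
libraries are in the coalition, the probability that the random coalition
$\mathcal{S}$ is a particular coalition $R$ is given by
$P(\mathcal{S} = R) = \prod_{l_i \in R} x_i \prod_{l_i \notin R} (1 - x_i)$. The
definition in (12) can be interpreted as the expectation of $f_j(\mathcal{S})$,
i.e., $h_j(x_1, x_2, \cdots, x_N) = \mathbb{E}(f_j(\mathcal{S}))$.
Let $Z_j$ be a random variable such that

$$Z_j = \begin{cases} \eta_j & \text{if } l_j \in \mathcal{S}, \\ 0 & \text{if } l_j \in \mathcal{L} - \mathcal{S}, \end{cases}$$ (18)

where $\eta_j$ denotes the detectability contributed by library $l_j$, and let $Y$
be another random variable defined by
$Y = \sum_{l_j \in \mathcal{S},\, j \neq i} Z_j = \sum_{l_j \in \mathcal{S},\, j \neq i} \eta_j$.
Since the $Z_j$'s are independent, $Y$ has the mean and variance

$$\mu(Y) = \sum_{j \neq i,\, l_j \in \mathcal{L}} \eta_j x_j,$$ (19)

$$\sigma^2(Y) = \sum_{j \neq i,\, l_j \in \mathcal{L}} \eta_j^2 \, x_j (1 - x_j).$$ (20)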

Hence, $h_i(x_1, \cdots, x_N)$, the partial derivative of $h$ with respect to
$x_i$, is the probability that a coalition wins to be $\omega$-effective with
respect to a set of sequences $M$ but loses if $l_i$ is removed from the
coalition. Since $Y$ excludes the contribution of $l_i$, the definition of
$\omega$-effectiveness lets us express $h_i$ as

$$h_i(x_1, x_2, \cdots, x_N) = P(\omega - \eta_i \le Y < \omega).$$ (21)

When the number of libraries grows, the random variable $Y$ can be approximated
by a normal random variable $\tilde{Y}$ with mean and variance given in (19) and
(20). Hence,

$$h_i(x_1, x_2, \cdots, x_N) \approx P(\omega - \eta_i \le \tilde{Y} < \omega).$$ (22)

The Shapley value can thus be computed from (13) and (15) by integrating
$h_i(t, t, \cdots, t)$ over $t \in [0, 1]$, with the random variable $\tilde{Y}$
having the mean and variance $\mu(\tilde{Y}) = t \sum_{j \neq i} \eta_j$ and
$\sigma^2(\tilde{Y}) = t(1 - t) \sum_{j \neq i} \eta_j^2$, respectively. The B-C
value can be approximated by evaluating $h_i(t, t, \cdots, t)$ at $t = 1/2$
using (22).
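The normal approximation can be compared against exact enumeration in a short Python sketch. It assumes that a swing for library $l_i$ means that the total contribution $Y$ of the other libraries present satisfies $\omega - \eta_i \le Y < \omega$, and that every other library joins independently with the same probability $t$. The weights, the 12-library set and the function names are hypothetical, and no continuity correction is applied.

```python
from fractions import Fraction
from itertools import combinations
from math import erf, sqrt

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal random variable."""
    return 0.5 * (1.0 + erf((x - mean) / (std * sqrt(2.0))))

def swing_probability_exact(weights, lib, omega, t):
    """P(omega - w_lib <= Y < omega) by enumerating the other libraries present."""
    others = [l for l in weights if l != lib]
    prob = 0.0
    for size in range(len(others) + 1):
        for present in combinations(others, size):
            y = sum(weights[l] for l in present)
            if omega - weights[lib] <= y < omega:
                prob += t ** size * (1.0 - t) ** (len(others) - size)
    return prob

def swing_probability_normal(weights, lib, omega, t):
    """Same probability with Y replaced by a normal variable of equal mean and variance."""
    others = [l for l in weights if l != lib]
    mean = t * sum(weights[l] for l in others)
    var = t * (1.0 - t) * sum(weights[l] ** 2 for l in others)
    if var == 0.0:
        return 1.0 if omega - weights[lib] <= mean < omega else 0.0
    std = sqrt(var)
    return normal_cdf(omega, mean, std) - normal_cdf(omega - weights[lib], mean, std)

# Contributions of the three libraries in the Section 4.2 example.
SMALL = {"l1": Fraction(2, 5), "l2": Fraction(1, 5), "l3": Fraction(1, 5)}
# Hypothetical larger set: 12 libraries, each contributing 1/12.
LARGE = {f"l{k}": Fraction(1, 12) for k in range(1, 13)}

print(swing_probability_exact(SMALL, "l1", Fraction(3, 5), 0.5))
print(swing_probability_exact(LARGE, "l1", Fraction(3, 5), 0.5),
      swing_probability_normal(LARGE, "l1", Fraction(3, 5), 0.5))
```

For the small example the exact value at $t = 1/2$ is 3/4, matching the three swings of $l_1$ out of four possible coalitions of the other libraries. The exact enumeration costs $2^{N-1}$ evaluations, whereas the approximation needs only the first two moments.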
5.3 Optimal Default Configuration

The indices of power rank the importance of each library from high to low. We
can make use of these indices to decide which libraries to load when the system
is subject to some cost constraint $C_0$. Toward that end, we arrive at an
integer programming problem as follows:

$$\max_{z} \sum_{l_i \in \mathcal{L}} \phi_i z_i \quad \text{s.t.} \quad \sum_{l_i \in \mathcal{L}} c_i z_i \le C_0, \quad z_i \in \{0, 1\}, \ \forall l_i \in \mathcal{L},$$ (23)

where $z_i = 1$ if library $l_i$ is loaded and $z_i = 0$ otherwise. We can use the
B-C index as well in the objective function. The optimization problem can be
viewed as a knapsack problem. The knapsack problem is well known to be NP-hard
(its decision version is NP-complete). However, there is a pseudo-polynomial-time
algorithm using dynamic programming and a fully polynomial-time approximation
scheme, which invokes the pseudo-polynomial-time algorithm as a subroutine.
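The pseudo-polynomial dynamic program can be sketched in a few lines of Python. The sketch assumes non-negative integer library costs and an integer budget; the function name, the costs and the budget below are hypothetical, while the values are the Shapley values of the example in Section 4.2.

```python
from fractions import Fraction

def knapsack(values, costs, budget):
    """0/1 knapsack by dynamic programming over integer budgets.

    Returns (best total value, sorted names of the selected libraries).
    Runs in O(number of libraries * budget) table updates.
    """
    # best[b] = (value, chosen set) achievable with total cost at most b
    best = [(0, frozenset())] * (budget + 1)
    for name in values:
        cost, value = costs[name], values[name]
        # Iterate budgets downward so each library is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + value
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] | {name})
    return best[budget][0], sorted(best[budget][1])

# Shapley values from Section 4.2; the costs and the budget are hypothetical.
VALUES = {"l1": Fraction(2, 3), "l2": Fraction(1, 6), "l3": Fraction(1, 6)}
COSTS = {"l1": 2, "l2": 1, "l3": 1}

print(knapsack(VALUES, COSTS, 3))
```

With a budget of 3 units the best total value is 5/6, obtained by loading `l1` together with one of the two cheaper libraries; ties are resolved in favor of the library encountered first.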



5.4 An Example 

In this section, we continue with the example in Section 4.2 but with an extended
attack tree depicted in Fig. 2. The libraries that can be used for detection are
$l_1, l_2, l_3$ and $l_4$, whose coverages are $P_{l_1} = \{a_1, a_2\}$,
$P_{l_2} = \{a_3, a_7\}$, $P_{l_3} = \{a_4, a_8\}$ and $P_{l_4} = \{a_6\}$,
respectively. There are 4 known attack sequences in the attack tree. Let $S_1$ be
the sequence $a_1 \to a_2 \to a_3 \to a_4 \to a_5$; $S_2$ denote
$a_1 \to a_2 \to a_6$; $S_3$ denote $a_1 \to a_2 \to a_3 \to a_7$; and $S_4$ be
$a_1 \to a_2 \to a_3 \to a_4 \to a_8$.




Fig. 2. Attack tree with attacks $a_j$, $j = 1, 2, \cdots, 8$, and libraries
$l_i$, $i = 1, 2, 3, 4$, that are used to detect attacks.



The Shapley values and B-C indices are summarized in Table 1 and Table 2,
respectively. Suppose each library is equally expensive, with a cost of 1 unit
per library. With the capacity constraint being 2 units, we can load libraries
$l_1$ and $l_2$ to optimize the default configuration. This choice is intuitively
plausible because $l_1$ and $l_2$ cover the major routes in the attack tree.
Library $l_4$ does not contribute to the result of detection as much as the other
libraries, and, for the value of $\omega$ considered, the impact of $l_4$ becomes
negligible when $l_1$ is used. When the size of the tree grows, we need to
evaluate the indices of power in an automated fashion and use a
pseudo-polynomial-time algorithm or an approximation scheme to find the solution
to the knapsack problem (23).

Table 1. Shapley value for the attack tree

Table 2. B-C index for the attack tree



6 Conclusion 



In this paper, we have adopted a cooperative game approach to study the in- 
fluence of each library when forming a configuration or a coalition to detect 
intrusions according to some known attack graphs. We have used the game ap- 
proach to connect the detector and the attacker, and developed novel notions of 
detectability and efficacy of detection. The paper has described the applications 
of Shapley value and B-C index in a combinatorial knapsack optimization prob- 
lem, which gives rise to an optimal configuration under the resource and cost 
constraints. The multilinear extension offers a technique to generalize the two 
indices discussed in the paper and, in addition, offers an approach to estimate 
these values when the number of libraries grows. 